CUDA 2D Stencil Computations for the Jacobi Method

Authors

  • José M. Cecilia
  • José M. García
  • Manuel Ujaldon
Abstract

This paper explores stencil operations in CUDA to optimize the Jacobi method on GPUs for solving Laplace’s differential equation. The code keeps the same access pattern across a large number of loop iterations, which makes it representative of a wide set of iterative linear algebra algorithms. The optimizations focus on data parallelism, thread deployment, and the GPU memory hierarchy, which the CUDA programmer manages explicitly. Experimental results are reported on Nvidia Tesla C870 and C1060 GPUs and compared with a counterpart version optimized for a quad-core Intel CPU. The speed-up factor for our set of GPU optimizations reaches 3-4x, and the execution times beat those of the CPU by a wide margin. The code also scales well when moving to a more sophisticated GPU architecture and/or more demanding problem sizes.
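As a rough illustration of the computation the abstract refers to, the sketch below performs Jacobi sweeps for Laplace's equation with the standard 5-point stencil on a small 2D grid. It is a plain sequential Python reference, not the authors' CUDA code; the function names (`make_grid`, `jacobi_sweep`, `solve_laplace`) and the boundary setup are assumptions made for this example. Each interior update reads only the previous grid, which is why the access pattern stays constant across iterations and why every point could, in principle, be handled by an independent GPU thread.

```python
def make_grid(n, top=1.0):
    """Hypothetical helper: n x n grid, top edge held at `top`, rest at 0."""
    grid = [[0.0] * n for _ in range(n)]
    grid[0] = [top] * n
    return grid


def jacobi_sweep(u):
    """Return a new grid after one Jacobi sweep of the 5-point stencil.

    Boundary cells are copied unchanged; each interior cell becomes the
    average of its four neighbours taken from the previous grid `u`.
    """
    rows, cols = len(u), len(u[0])
    new = [row[:] for row in u]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = 0.25 * (
                u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1]
            )
    return new


def solve_laplace(u, iterations):
    """Apply a fixed number of Jacobi sweeps (no convergence test)."""
    for _ in range(iterations):
        u = jacobi_sweep(u)
    return u


if __name__ == "__main__":
    result = solve_laplace(make_grid(4), 200)
    for row in result:
        print(" ".join(f"{value:.4f}" for value in row))
```

Writing into a separate output grid (rather than updating in place) is what distinguishes Jacobi from Gauss-Seidel, and it is also what makes the per-point updates independent of each other within a sweep.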

Similar resources

High-Level Programming of Stencil Computations on Multi-GPU Systems Using the SkelCL Library

The implementation of stencil computations on modern, massively parallel systems with GPUs and other accelerators currently relies on manually-tuned coding using low-level approaches like OpenCL and CUDA. This makes development of stencil applications a complex, time-consuming, and error-prone task. We describe how stencil computations can be programmed in our SkelCL approach that combines high...

Extending the SkelCL Skeleton Library for Stencil Computations on Multi-GPU Systems

The implementation of stencil computations on modern, massively parallel systems with GPUs and other accelerators currently relies on manually-tuned coding using low-level approaches like OpenCL and CUDA, which makes it a complex, time-consuming, and error-prone task. We describe how stencil computations can be programmed in our SkelCL approach that combines high level of programming abstractio...

A stencil of the finite-difference method for the 2D convection diffusion equation and its new iterative scheme

The paper gives the numerical stencil for the two-dimensional convection diffusion equation and the technique of elimination, and builds up the new iterative scheme to solve the implicit difference equation. The scheme’s convergence and its higher rate of convergence than the Jacobi iteration are proved. And the numerical example indicates that the new scheme has the same parallelism and a high...

High Performance Stencil Code Algorithms for GPGPUs

In this paper we investigate how stencil computations can be implemented on state-of-the-art general purpose graphics processing units (GPGPUs). Stencil codes can be found at the core of many numerical solvers and physical simulation codes and are therefore of particular interest to scientific computing research. GPGPUs have gained a lot of attention recently because of their superior floating ...

Optimizing Transformations of Stencil Operations for Parallel Cache-based Architectures

This paper describes a new technique for optimizing serial and parallel stencil and stencil-like operations for cache-based architectures. This technique takes advantage of the semantic knowledge implicitly in stencil-like computations. The technique is implemented as a source-to-source program transformation; because of its specificity it could not be expected of a conventional compiler. Empiri...

Publication date: 2010